254 research outputs found

Examining the validity of cross-lingual word sense disambiguation

Authors: Hoste Veronique; Lefever Els
Publication date: 01/01/2011

A classification-based approach to economic event detection in Dutch news text

Authors: Hoste Veronique; Lefever Els
Publication venue: ELRA
Publication date: 01/01/2016

Breaking news on economic events such as stock splits or mergers and acquisitions has been shown to have a substantial impact on the financial markets. As it is important to be able to automatically identify events in news items accurately and in a timely manner, we present in this paper proof-of-concept experiments for a supervised machine learning approach to economic event detection in newswire text. For this purpose, we created a corpus of Dutch financial news articles in which 10 types of company-specific economic events were annotated. We trained classifiers using various lexical, syntactic and semantic features. We obtain good results based on a basic set of shallow features, thus showing that this method is a viable approach for economic event detection in news text.

Ghent University Academic Bibliography

Comparing learning approaches to language learning : there is more to it than 'bias'

Authors: Daelemans Walter; Hoste Veronique
Publication venue: Ghent University
Publication date: 01/01/2006

Ghent University Academic Bibliography

Institutional Repository Universiteit Antwerpen

Comparing learning approaches to coreference resolution : there is more to it than 'bias'

Authors: Daelemans Walter; Hoste Veronique
Publication date: 01/01/2005

Ghent University Academic Bibliography

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Towards a balanced named entity corpus for Dutch

Authors: Desmet Bart; Hoste Veronique
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2010

Ghent University Academic Bibliography

Rude waiter but mouthwatering pastries! An exploratory study into Dutch aspect-based sentiment analysis

Authors: De Clercq Orphée; Hoste Veronique
Publication venue: ELRA
Publication date: 01/01/2016

The fine-grained task of automatically detecting all sentiment expressions within a given document and the aspects to which they refer is known as aspect-based sentiment analysis. In this paper we present the first full aspect-based sentiment analysis pipeline for Dutch and apply it to customer reviews. For this purpose, we collected reviews from two different domains, i.e. restaurant and smartphone reviews. Both corpora have been manually annotated using newly developed guidelines that comply with standard practices in the field. For our experimental pipeline we treat aspect-based sentiment analysis as a task consisting of three main subtasks which have to be tackled incrementally: aspect term extraction, aspect category classification and polarity classification. First experiments on our Dutch restaurant corpus reveal that this is indeed a feasible approach that yields promising results.

Ghent University Academic Bibliography

All mixed up? Finding the optimal feature set for general readability prediction and its application to English and Dutch

Authors: De Clercq Orphée; Hoste Veronique
Publication venue: MIT Press - Journals
Publication date: 01/01/2016

Readability research has a long and rich tradition, but there has been too little focus on general readability prediction without targeting a specific audience or text genre. Moreover, though NLP-inspired research has focused on adding more complex readability features, there is still no consensus on which features contribute most to the prediction. In this article, we investigate in close detail the feasibility of constructing a readability prediction system for English and Dutch generic text using supervised machine learning. Based on readability assessments by both experts and a crowd, we implement different types of text characteristics ranging from easy-to-compute superficial text characteristics to features requiring deep linguistic processing, resulting in ten different feature groups. Both a regression and a classification setup are investigated, reflecting the two possible readability prediction tasks: scoring individual texts or comparing two texts. We show that going beyond correlation calculations for readability optimization using a wrapper-based genetic algorithm optimization approach is a promising task which provides considerable insights into which feature combinations contribute to the overall readability prediction. Since we also have gold standard information available for those features requiring deep processing, we are able to investigate the true upper bound of our Dutch system. Interestingly, we observe that the performance of our fully automatic readability prediction pipeline is on par with the pipeline using gold-standard deep syntactic and semantic information.

Crossref

Ghent University Academic Bibliography

Optimization issues in machine learning of coreference resolution

Author: Hoste Veronique
Publication venue: Universiteit Antwerpen. Faculteit Letteren en Wijsbegeerte.
Publication date: 01/01/2005

Ghent University Academic Bibliography

Mental distress detection and triage in forum posts: the LT3 CLPsych 2016 shared task system

Authors: Desmet Bart; Hoste Veronique; Jacobs Gilles
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2016

This paper describes the contribution of LT3 for the CLPsych 2016 Shared Task on automatic triage of mental health forum posts. Our systems use multiclass Support Vector Machines (SVM), cascaded binary SVMs and ensembles with a rich feature set. The best systems obtain macro-averaged F-scores of 40% on the full task and 80% on the green versus alarming distinction. Multiclass SVMs with all features score best in terms of F-score, whereas feature filtering with bi-normal separation and classifier ensembling are found to improve recall of alarming posts.

Crossref

Ghent University Academic Bibliography

Cultivating trees: adding several semantic layers to the Lassy treebank in SoNaR

Authors: Delaere Isabelle; Hoste Veronique; Monachesi Paola
Publication venue: LOT (Landelijke Onderzoekschool Taalwetenschap)
Publication date: 01/01/2009

Ghent University Academic Bibliography